20 research outputs found

    Progressive-Hint Prompting Improves Reasoning in Large Language Models

    The performance of Large Language Models (LLMs) in reasoning tasks depends heavily on prompt design, with Chain-of-Thought (CoT) and self-consistency being critical methods that enhance this ability. However, these methods do not fully exploit the answers generated by the LLM to guide subsequent responses. This paper proposes a new prompting method, named Progressive-Hint Prompting (PHP), that enables automatic multiple interactions between users and LLMs by using previously generated answers as hints to progressively guide toward the correct answers. PHP is orthogonal to CoT and self-consistency, making it easy to combine with state-of-the-art techniques to further improve performance. We conducted an extensive and comprehensive evaluation to demonstrate the effectiveness of the proposed method. Our experimental results on six benchmarks show that combining CoT and self-consistency with PHP significantly improves accuracy while remaining highly efficient. For instance, with text-davinci-003, we observed a 4.2% improvement on GSM8K with greedy decoding compared to Complex CoT, and a 46.17% reduction in sample paths with self-consistency. With GPT-4 and PHP, we achieve state-of-the-art performance on SVAMP (91.9%), GSM8K (95.5%) and AQuA (79.9%). Comment: Tech Report
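
The hint loop described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the hint wording, the `max_rounds` cap, and the rule of stopping once two consecutive answers agree are all assumptions made for this example, and `toy_model` is a hypothetical stand-in for a real LLM call.

```python
from typing import Callable, List, Optional


def progressive_hint_prompting(
    question: str,
    ask_model: Callable[[str], str],
    max_rounds: int = 5,
) -> str:
    """Re-ask a question, feeding earlier answers back to the model as hints.

    Assumed stopping rule: return as soon as two consecutive answers agree,
    or after max_rounds queries, whichever comes first.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    hints: List[str] = []            # every answer seen so far
    previous: Optional[str] = None   # answer from the preceding round

    for _ in range(max_rounds):
        prompt = question
        if hints:
            # Hypothetical hint template appended to the base question.
            prompt += f" (Hint: The answer is near to {', '.join(hints)}.)"
        answer = ask_model(prompt).strip()
        if answer == previous:
            return answer            # answer is stable across two rounds
        hints.append(answer)
        previous = answer

    return hints[-1]                 # no agreement reached; use the last answer


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM: wrong without a hint, right with one.
    return "9" if "Hint" in prompt else "8"


if __name__ == "__main__":
    print(progressive_hint_prompting("What is 4 + 5?", toy_model))
```

With the toy model, the first round answers without a hint, the second round sees that answer as a hint and revises it, and the third round repeats the revised answer, which ends the loop.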

    Aria-NeRF: Multimodal Egocentric View Synthesis

    We seek to accelerate research in developing rich, multimodal scene models trained from egocentric data, based on differentiable volumetric ray-tracing inspired by Neural Radiance Fields (NeRFs). The construction of a NeRF-like model from an egocentric image sequence plays a pivotal role in understanding human behavior and holds diverse applications within the realms of VR/AR. Such egocentric NeRF-like models may be used as realistic simulations, contributing significantly to the advancement of intelligent agents capable of executing tasks in the real world. The future of egocentric view synthesis may lead to novel environment representations going beyond today's NeRFs by augmenting visual data with multimodal sensors such as IMU for egomotion tracking, audio sensors to capture surface texture and human language context, and eye-gaze trackers to infer human attention patterns in the scene. To support and facilitate the development and evaluation of egocentric multimodal scene modeling, we present a comprehensive multimodal egocentric video dataset. This dataset offers a comprehensive collection of sensory data, featuring RGB images, eye-tracking camera footage, audio recordings from a microphone, atmospheric pressure readings from a barometer, positional coordinates from GPS, connectivity details from Wi-Fi and Bluetooth, and information from dual-frequency IMU datasets (1kHz and 800Hz) paired with a magnetometer. The dataset was collected with the Meta Aria Glasses wearable device platform. The diverse data modalities and the real-world context captured within this dataset serve as a robust foundation for furthering our understanding of human behavior and enabling more immersive and intelligent experiences in the realms of VR, AR, and robotics.

    FIMO: A Challenge Formal Dataset for Automated Theorem Proving

    We present FIMO, an innovative dataset comprising formal mathematical problem statements sourced from the International Mathematical Olympiad (IMO) Shortlisted Problems. Designed to facilitate advanced automated theorem proving at the IMO level, FIMO is currently tailored for the Lean formal language. It comprises 149 formal problem statements, accompanied by both informal problem descriptions and their corresponding LaTeX-based informal proofs. Through initial experiments involving GPT-4, our findings underscore the existing limitations in current methodologies, indicating a substantial journey ahead before achieving satisfactory IMO-level automated theorem proving outcomes

    TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models

    Automated theorem proving (ATP) has become an appealing domain for exploring the reasoning ability of the recent successful generative language models. However, current ATP benchmarks mainly focus on symbolic inference, but rarely involve the understanding of complex number combination reasoning. In this work, we propose TRIGO, an ATP benchmark that not only requires a model to reduce a trigonometric expression with step-by-step proofs but also evaluates a generative LM's reasoning ability on formulas and its capability to manipulate, group, and factor number terms. We gather trigonometric expressions and their reduced forms from the web, annotate the simplification process manually, and translate it into the Lean formal language system. We then automatically generate additional examples from the annotated samples to expand the dataset. Furthermore, we develop an automatic generator based on Lean-Gym to create dataset splits of varying difficulties and distributions in order to thoroughly analyze the model's generalization ability. Our extensive experiments show that the proposed TRIGO poses a new challenge for advanced generative LMs, including GPT-4, which is pre-trained on a considerable amount of open-source formal theorem-proving language data, and provides a new tool to study the generative LM's ability on both formal and mathematical reasoning. Comment: Accepted by EMNLP 2023. Code is available at https://github.com/menik1126/TRIG

    LEGO-Prover: Neural Theorem Proving with Growing Libraries

    Despite the success of large language models (LLMs), the task of theorem proving still remains one of the hardest reasoning tasks that is far from being fully solved. Prior methods using language models have demonstrated promising results, but they still struggle to prove even middle school level theorems. One common limitation of these methods is that they assume a fixed theorem library during the whole theorem proving process. However, as we all know, creating new useful theorems or even new theories is not only helpful but crucial and necessary for advancing mathematics and proving harder and deeper results. In this work, we present LEGO-Prover, which employs a growing skill library containing verified lemmas as skills to augment the capability of LLMs used in theorem proving. By constructing the proof modularly, LEGO-Prover enables LLMs to utilize existing skills retrieved from the library and to create new skills during the proving process. These skills are further evolved (by prompting an LLM) to enrich the library on another scale. Modular and reusable skills are constantly added to the library to enable tackling increasingly intricate mathematical problems. Moreover, the learned library further bridges the gap between human proofs and formal proofs by making it easier to impute missing steps. LEGO-Prover advances the state-of-the-art pass rate on miniF2F-valid (48.0% to 57.0%) and miniF2F-test (45.5% to 47.1%). During the proving process, LEGO-Prover also manages to generate over 20,000 skills (theorems/lemmas) and adds them to the growing library. Our ablation study indicates that these newly added skills are indeed helpful for proving theorems, resulting in an improvement from a success rate of 47.1% to 50.4%. We also release our code and all the generated skills

    MPC-based path following design for automated vehicles with rear wheel steering

    Many recent studies have discussed path following control algorithms for automated vehicles using various control techniques. However, path following algorithms that consider automated vehicles with rear wheel steering (RWS) remain less investigated. In this study, we implemented nonlinear model predictive control (NMPC) on a passenger vehicle with active RWS for path following. The controller was compared to two other variations of NMPC in which the rear steering angle is proportional to the front steering angle or fixed to zero. Simulation results suggested that the proposed controller outperforms the other two variations and the baseline controllers (Stanley and LQR) in terms of accuracy and responsiveness. Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public. Intelligent Vehicle

    Research progress in nitrogen-doped electrically conductive silicon carbide ceramics

    Electrically conductive silicon carbide (SiC) ceramics, which can be machined by electrical discharge machining, not only overcome the prominent machinability shortcomings of traditional high-resistivity-grade SiC ceramics but also retain their other excellent properties. They therefore have outstanding advantages as a replacement for traditional high-resistivity-grade SiC ceramics in the field of structural ceramics. In this paper, the nitrogen doping principle of electrically conductive SiC ceramics was illustrated, and the powder sintering methods, sintering additives, and thermoelectric and mechanical properties were summarized. To provide guidance for the control of electrical properties, the factors that affect them were discussed. Finally, the main challenges of nitrogen-doped electrically conductive SiC ceramics were pointed out, and future work was suggested to focus on developing new sintering technologies and additives and on clarifying the control mechanism of electrical properties, thereby establishing the technical foundation for fabricating high-performance conductive SiC ceramics with controllable electrical resistivity.

    Characterization of SiC Ceramic Joints Brazed Using Au–Ni–Pd–Ti High-Temperature Filler Alloy

    In this work, (Au79Ni17Pd4)96Ti4 (wt.%) filler alloy was designed and employed to join SiC ceramics. The effects of brazing temperature and soaking time on the microstructure and fracture morphology of joints were investigated. The results show that the joint obtained can be described as SiC/reaction layer/braze/reaction layer/SiC. The reaction layer was composed of TiC and Au (Si, Ti). The wettability of the filler alloy toward the SiC ceramics was analyzed. The braze zone consisted mainly of Pd2Si, Ni2Si, and Au (Ni, Si). A large number of nano-sized TiC particles were distributed within the Au (Ni, Si) layer. The formation mechanism of the braze containing different phases was discussed. The brazing temperature and soaking time had a significant effect on the reaction layer at the SiC/braze interface and on the TiC particles within the Au (Ni, Si) layer, while they showed a negligible effect on the Pd2Si and Ni2Si within the braze. The inherent reason was also clarified in detail. The joint fractography indicated that good bonding was achieved between the filler alloy and SiC, while joint fracture was primarily induced by the thermal stresses remaining after the brazing cycle.